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CRITICAL POINTS VIA MONODROMY AND LOCAL METHODS 


ABRAHAM MARTIN DEL CAMPO AND JOSE ISRAEL RODRIGUEZ 


Abstract. In many areas of applied mathematics and statistics, it is a fundamental problem 
to find the best representative of a model by optimizing an objective function. This can be done 
by determining critical points of the objective function restricted to the model. 

We compile ideas arising from numerical algebraic geometry to compute the critical points 
of an objective function. Our method consists of using numerical homotopy continuation and 
a monodromy action on the total critical space to compute all of the complex critical points of 
an objective function. To illustrate the relevance of our method, we apply it to the Euclidean 
distance function to compute ED-degrees and the likelihood function to compute maximum 
likelihood degrees. 


1. Introduction 

In science and engineering, it is common to work with models that can be described as the 
solutions of a parametrized system of polynomial equations. For such algebraic models X C M n , 
we are interested in the following polynomial optimization problem: given u € M n , find those 
points x* E X that optimize an objective function \I >(x,u ) : M n x M n -A M. Common objective 
functions seen in applications include norms, distances, and other statistical functions. In this 
paper, we focus on the case when 'F is a quadratic norm (such as the Euclidean distance) and 
when \F is the likelihood function. 

By Fermat’s theorem, we find x* by computing the set of critical points using the method 
of Lagrange multipliers. Thus, we are interested in finding the points x € X for which the 
gradient V\F(x, u) is orthogonal to the tangent space T x X of X at the point x. We recast these 
constraints as the solutions to a system of polynomial equations. In the algebraic closure, the 
number d of critical points is an invariant that gives information about the algebraic complexity 
of the optimization problem. Finding the number d for different objective functions and models 
is a challenging problem. When \F is the Euclidean distance, the number d is called the Eu¬ 
clidean distance degree (ED-degree), and when \F is the likelihood function, the number d is the 
maximum likelihood degree (ML-degree). 

Finding these degrees d is an active area of research. There are already some fundamental 
results about these algebraic degrees and the geometry behind them. A general degree theory 
for the ED-degree was introduced in [7] and for the ML-degree in HETUE]. For some special 
cases, it is possible to find formulas for d using techniques from algebraic geometry E using 
EH |23l EH [251 127] , some are summarized in |26| . However, this is not always the case, and other 
methods are necessary. 

There are algorithms proposed for identifying critical points based on local methods. Such 
methods involving Grobner bases include mmm- From the numerical algebraic geometry 
perspective, those in |32j [2] are can be used and involve SVD decomposition and regeneration 
(respectively) to find critical points for the computation of witness sets. For computing the real 
critical points, one method is to compute all complex solutions and determine the real solutions 
among them. This can be avoided by using local methods to determine local optima and using 
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heuristics to decide if the local optima is in fact global. In these cases, one can study the expected 
number of real critical points when u is drawn from a given probability distribution. This has 
been studied for the ED-degree under the name of average ED-degree (e.g. 1). Recently, a 
probabilistic method was proposed in [28], where they classify the real critical points for the 
likelihood function. 

In this paper, we propose a different numerical method to compute all the complex critical 
points. Our method consists of two parts and complements those used in p~2l IT5] . The first 
part consists of exploiting a monodromy group action to randomly explore the variety of critical 
points. In this random exploration, we determine a subset of critical points. The second part 
consists of a trace test to determine, with probability one, if the computed subset contains all 
of the critical points. 

This paper is organized as follows. In the next section, we give a geometric formulation 
of the critical equations along with a concrete formulation of the critical equations. In the 
following section, we review monodromy for a parameterized polynomial system. This includes 
a description of each part of our method, including a trace test. We conclude by using these 
methods to reproduce some known ED-degrees and ML-degrees, and compare the times of 
those computations against our method. Throughout the paper, we include implementation 
subsections so the reader can use the methods in their own research. Supplementary materials 
can be found on the second authors website at http://www3.nd.edu/~jrodril8/monodromy. 

We end this introduction with an illustrating example to the critical points problem. 


Example 1.1. Let X be the ellipse defined as the algebraic variety of the points x = (xi,x 2 ) G 
M 2 that satisfy the polynomial equation 

1744^ - 2016xiX2 - 2800xi + 1156x| + 2100x 2 + 1125 = 0. 

For a (generic) choice of u = (rti,rt 2 ) G M 2 , we are interested in the optimization problem 

(1.1) min T(x, u) = (xi — u \) 2 + (x 2 — tt 2 ) 2 subject to x G X. 


Here, the objective function 'I'(x,n) is the square of the Euclidean distance between the (fixed) 
point u G M 2 and the ellipse X. The critical points of (1.1) are those points x* G X whose 
tangent is perpendicular to the line segment joining x* and u. This is illustrated in Figure [lj 
By using Lagrange multipliers, finding the critical points corresponds to finding the solutions to 



Figure 1. Critical points for the distance between the point u = (0.75,-0.29) 
and the ellipse 1744xf — 2016xix 2 — 2800xi + 1156x| + 2100x 2 + 1125 = 0 
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the following polynomial system 

1744x?-2016xiX2 - 2800xi+ 1156x1+ 2100x2 + 1125 = 0 
(1.2) xi - m + A (3488xi - 2016x 2 - 2800) = 0 

X 2 — U 2 + A (—2016xi + 2312x2 + 2100) = 0. 

where xi,X 2 , and A are indeterminates, and u\,U 2 are parameters. For instance, when u = 
(0.75, —0.29), we use tools from numerical algebraic geometry to find the set of critical points, 
which consist of the points 


(0.8444,-0.3330), (0.8329,-0.0066), (0.5985,-0.0941), and (0.2529,-0.8140). 


These are the coordinates on xi and X 2 of the solutions to (1.2). In this case, there are only four 
solutions, showing that the ED-degree for X is 4. The solution to the optimization problem (1.1) 
corresponds to the critical point with the smallest Euclidean distance to u, which in this case is 
the first point. 


2. Critical points 

In this section, we will introduce critical points of + restricted to an affine variety X defined 
by a system of polynomials in the indeterminates xi,..., x n . To a given variety X E C n and a 
function T, we associate its total critical variety Y E C n x C n that consists of all the critical 
points of + in X. This critical variety is defined by a system of polynomial equations in two 
sets of indeterminates: x\,... ,x n and u\,... ,u n . We refer to xi,...,x n as the indeterminates 
of the model and to u\, ..., u n as the parameters for 

We let the system of polynomials F = {/i,..., f m } in the indeterminates xi,..., x n define 
an affine variety X of C n . We call this variety the model. In engineering and statistics, we are 
interested in the real points of the variety X. However, we work over the complex numbers and 
then restrict to the real numbers, as is usual in applied algebraic geometry. In this way, the 
monodromy methods that we discuss in the next sections will let us conclude properties about 
X that can be used to understand its real counterpart. 

The objective functions + we consider are the Euclidean norm and the likelihood function, 
but our method holds also for functions T(x, u) : M n x W 1 -A M with a derivative (or logarithmic 
derivative) that is a rational function in xi,..., x n . We regard T as a family of functions \k u 
parametrized by a G C". 

Let JacX denote the Jacobian of X, which is the nxm- matrix whose (i,j)th entry is given 
by dfj/dxi. Let V/j denote the gradient vector of fj : thus, we can write 

j&cx = [ v/7" ••• v/T], 

The rank of the Jacobian of X evaluated at a point x in X is at most the codimension of X. 
We say the point x* in X is regular if the Jacobian at x* has rank equal to the codimension of 
X ; otherwise, the point x* is said to be singular. We let X reg denote the set of regular points 
of X and let X s i ng denote the set of singular points of X. Thus, we have X = X reg U X s i ng . 

If V T denotes the (truncated) gradient vector whose entries are the partial derivatives <9\k /dxi 
for alii = 1 then, a critical point of 4/ on X can be defined with respect to the Jacobian of 

X and V4L Let H be the set of points in C n where any denominator of the rational coordinates 
of VT vanishes. 

Definition 2.1. A point x E X is a critical point of T if and only if 
(2.1) x E X reg \H and rank [ VT t V/7" V/^ ] < m. 
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When 

^ u± U2 t 


is the likelihood function ^(x,u) = xf' • • • x“ n , the gradient VJ^x, it) equals 




, thus, the associated H is the solution set defined by X 1 X 2 1 

'X’Tl ' 

the quadratic norm \I >(x,u ) = (xi — u\) 2 + (x 2 — U 2) 2 + ■ ■ ■ + (x n — u n ) 2 , 
VT(x, u) = (x'i — u\, X 2 — U 2 ,... ,x n — u n )\ thus, H is the empty set. 

Our problem consists of Ending those x* such that, 

(2.2) min /max v h M (x) subject to x E X\H, for fixed u E M n . 


• • x n = 0. For 
the gradient is 


Note that because the conditions (2.1) are additive with respect to irreducible components we 
can assume that X is an irreducible variety. We regard the objective function $ as a function 
depending on a fixed parameter u E M n ; thus, we interpret it as a family of objective functions \k n 
parametrized by u. Therefore, we are interested in the set of critical points for the parametrized 
family of objective functions T u . 

Definition 2.2. Let X be an irreducible variety and be a parametrized family of objective 
functions as above. We define the total critical variety Y as the Zariski closure of the set of 


critical points of 
(2.3) 


in X reg \H, this is 


Y := {(x,u) E C" x C" : 1 E X reg \H and x is a critical point for +„}. 

The total critical variety has a natural projection to C n associated to the coordinates u 1 ,..., u n . 
When is the likelihood function, Y is called the likelihood correspondence, and the degree of 
this projection over a general point is called the maximum likelihood degree (ML degree) of X. 
When is the Euclidean distance function, Y is called the EDcorrespondence, and the degree 
of this projection over a general point is called the Euclidean distance degree (ED degree) of X. 

The constraints (2.1) impose algebraic equations that exhibit Y as an algebraic variety. Work¬ 
ing in the polynomial ring C[xi,..., x n , u\, ..., u n \, we let 1(X) denote the prime ideal defining 
X, let 1(H) denote the ideal of H, and let J(X, T) denote the (m+l)-minors of the matrix 

[W T V/T ••• V/T], 

We regard VT as the vector whose entries are only the partial derivatives (9\k jdxi for i = 1,..., n. 
Thus, the defining equations for Y are the generators of the ideal 

(2.4) (X(X) + J(X, fr)) : (1(H) • (minors(m, JacX)))°°. 

These equations form an overdetermined system of equations. However, since our techniques 
use numerical algebraic geometry tools, we would prefer to define the total critical variety by a 
square system of equations where the number of indeterminants equals the number of equations. 

One way to do this is to introduce auxiliary unknowns and consider a variety Y that is an 
irreducible component of a variety Z where Y projects to Y. We use Lagrange multipliers 
to define a squared systems as follows. For a fixed u E C n , consider the polynomial system 
G : C n X P m —X C m+n given by 

F(x) 

\ 0 VV u (x) T + Ai V/r(x) T + • • • + A m V/ m (x) T 

Note that if x* El satisfy the conditions (2.1), then there exists A* E P m such that G(x*,X*) = 0 
by the Fritz John condition m- In the affine patch where Ao = 1, the system G becomes a 
square system and its solutions (x*,A*) project to critical points x* E X reg . We will use the 
system (2.5) when we refer to the square system defining the total critical variety Y. 

Example 2.3. Consider X in C 4 defined by the equations /1 = X 1 X 3 — x 2 , f 2 = X 2 X 4 — x 2 , fy = 
X 1 X 4 — X 2 X 3 and the objective function = (xi — u\) 2 + • • • + (X 4 — U4) 2 . The codimension of 
X is 2. Let F = {/i,/ 2 } be the following linear combinations of the fy. 


(2.5) 


G(x, A) : = 


fi = ^( 2 /i+ 3 / 2 + 5/3) 


h = -!^(7/i + 11/2 + 13/3) 
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The variety defined by F is reducible but still contains X as an irreducible component. The 
additional component X' is the 2-dinrensional linear space defined by 9 x 2 — X 3 — 16 x 4 = 
81xi — 145x3 — 16 x 4 = 0. This randomization procedure of making a regular sequence from 
an overdetermi ned s ystem of equations is a standard tool of numerical algebraic geometry. The 
equations from (2.5) define a reducible variety Z in C 2 xC 4 xC 4 . This is a system of 6 equations 


in 10 indeterminants xi, X2, X3, X4, Ai, A 2 , tti, U2,113, U4 given by 

fi = 0 , h = 0 ^ 

(xi - ui) + Ai||- + A 2 || = 0 


(x 4 - “4) + + A 2 |^ - 0. 

This variety is reducible with two components Y and Y'. The component Y and the total critical 
variety Y of X are birationally equivalent, and the same is true for the second component Y' 
and the total critical variety Y' of the linear space X'. 

The fiber of the projection of Y and thus Y to C 4 associated to the coordinates ui,U 2 ,us, 114 
over the point u* = (|, — |, |, |) is described by the 6 equations above and the 4 equations below: 

u\ = 2/5, U 2 = —2/7, U 3 = 5/6, 114 = 3/7. 

Solving the system of 10 equations, we find 7 solutions. Six of them have x coordinates in X 
and the last solution has x coordinates in X'. For this example, there are only 3 real critical 
points and their x coordinates are listed below. The first two points lie in X while the last point 
is in X'. 

X 3 X 4 *„(x) 

.496407 .975616 .776241 

.0130495 .00246721 .981488 

.384257 -.186399 .642893. 


Xl X2 

.128515 .252579 

.365062 .0690207 

.651048 -.288682 


The first and last points in the list are the closest points from u* in X and X' respectively. 
The following theorem justifies that the ML-degree and ED-degree are well defined. The corre¬ 
sponding proofs can be found in [8] and |7j respectively. 

Theorem 2.4. The total critical variety Y is an irreducible variety of dimension n inside 
C n x C n and there exist open sets where the second projection p : Y —>• C n is generically finite 
and dominant. 

With our monodromy tec hniqu es, we were able to compute the 6 points in the fiber of u* 
from p : Y —> C 4 in Example 2.3 These techniques have the advantage of ignoring the critical 
point on junk components (in this case X'). The novelty of this paper is the compilation of 
ideas arising from numeral algebraic geometry applied to the computation of critical points. In 
addition, we have developed an implementation of monodromy homotopies that can be used 
to compute fibers of projections. This implementation uses the numerical algebraic geometry 
software Bert ini [4j and commutative algebra software Macaulay2 |11| . In particular, it involves 
the software packages Bertini.M2 [3] and NAGtypes [22] . 

3. Monodromy and general methods 

In this section we define the action of a monodromy group on the fiber of the projection of the 
total critical variety Y. This action is one of the main tools we use to compute critical points. 
We start by considering a polynomial system F = {/ 1 ,..., f m } that defines an irreducible affine 
variety X of codimension k and an objective function \k. The variety X and the function T 
determine the total critical variety Y from Definition 2.2 The projection n : Y —> X C C n 
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given by (x,u) —> x realizes Y as an affine vector bundle of rank k over X reg . For the second 
projection p : Y —>• C n defined by (x, u ) i —> u, there is an open set U C C n where the following 
fiber diagram holds 


(3.1) 


p~ 1 {U)‘ 
P , 

U - 




p 


with the restriction of p onto U being a generically finite morphism of degree d. A loop in U 
based at u has d lifts to p _ 1 (?7), one for each point in the fiber p~ l {i t). Associating a point 
in the fiber p~ 1 (u) to the endpoint of the corresponding lift gives a permutation in Sd- This 
defines the usual permutation action of the fundamental group of U on the fiber p~ l (u). The 
monodromy group of the map p : Y —> C n is the image of the fundamental group of U in Sd- 

The equations of (2.5) define a reducible variety Z which contains an irreducible component 
Y which is birationally equivalent to Y. Both Z and Y lie on C m x C n x C n , and the projection 
p : Y —> C n to the u coordinates factors with the projection p : Y C n . This factorization is 
compatible with the monodromy action on the fiber p~ k (u) over a regular point u. 

The idea behind the method is to use numerical homotopy continuation to compute the fiber 
at a regular point u* G U. Suppose that we are endowed with a point ( \q,Xq,Uq ) in the fiber 
p~ k (u o). We generate a random loop 7 : [0,1] -A U based at u, meaning 7 ( 0 ) = 7 ( 1 ) = u* Q . We 
numerically follow the points in the fibers p~ l { y(i)) as t deforms from 0 to 1. This computes 
a lift of 7 to a path from Xq to another point x\ in the fiber ]5 -1 (it q). Computing sufficiently 
many of these random loops enable us to recover the fiber. 


3.1. Populating the fiber. To describe our method, we start by discussing its components 
and the numerical tools they use. Suppose for a general u we have a critical point xq for 
Then, we create a random loop 7 based on u in the parameter space, and we use parameter 
homotopies to track the path from the point xq to another point x\ in the fiber p~ l (u). We now 
describe the details of this part. 

Let cf)(x\u ) : C N X C k —> C N be a parametrized family of polynomial equations 




$1 (■£ 1) • • ■ j Xn 1 Ul , . . . , Ufc) 

4>n(x 1 ,... ,xn ; ui,... ,Uk) 


= 0 . 


In other words, u) is a system of N polynomials in the variables x G and parameters 
u G C fc . Thus, for a fixed choice of it G C fc , we have a squared system. The number of 
nonsingular (isolated) solutions of 4 >(x, u) = 0 remains constant for general choices of parameter 
values a £ C fc , Parameter homotopy consists of tracking known solutions of the system cj)(x, u ) 
for specific parameter values u G C k to find solutions of the system for other parameter values 
u' G C k . For a fixed u G C fc , a path between u and u' is a continuous function 7 (t) : [0,1] —> C k 
such that 7 ( 0 ) = u and 7 ( 1 ) = v!, for which 7 (t) G C k stays generic for t G [0,1). If S is the set 
of solutions of (j)(x ; u ) for a generic choice of parameters u G C fc , a parameter homotopy consists 
of choosing a path 7 (t) between u and a generic u', and follow the solutions of the homotopy 


h{x,t) := 0 (x, 7 (t)) = 0 


starting at S as t goes from 0 to 1. In particular, if 7 is a loop based at u, then h(x,t) defines 
a path between the solution set S, which we call a monodromy path. 

We refer the reader to 13 EDI for more about parameter homotopies and we focus on its 
application to our problem. The parametrized system cj) that we consider is the one defined in 
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the affine chart where Aq = 1 by G(x, A; it) = 0 as in 


G(x, A; it) := 


V^(x;n) T + AiVFi(x) 1 -\ -b X m XF m (x) 


F(x) 


a system of n+m polynomials in the variables xi,... , x n , Aj,..., X m and parameters u\,... ,u n . 
We assume that for a given choice of parameters u £ C n , we know a critical point x* £ C n . In 
Section 34 we discuss some possible ways to find the first critical point x* for a given choice 
of parameter values u £ C n ; for now, we focus on the way we use parameter homotopies to find 
the rest of the critical points. 

Let F : C n —> C m be a polynomial system defining our model and let S denote its complete 
set of critical points. By letting A* = (0,..., 0) £ C m , the system G vanishes at (x*,A*), so 
let So := {(x*, A*)} be the starting solution set. We generate a loop based on u by taking two 
random parameter points u\ u" £ C n and we create a triangular loop u —» v! -» u" -A u using 
linear paths of the form t • u" + (1 — t) ■ u'. We track the solution (x*,A*) to a new solution 
(a/, A 7 ) of G(x,X]u). In this way, if S r C S is a set of solutions obtained by repeating this 
process r times, we obtain a new solution set by taking a new random loop to track S r to S' r and 
letting S r +i := S r U S' r . Notice that S' r may coincide with S r , but we always have the inclusions 
S r Q <SV+i Q S. We continue constructing random monodromy paths until S r +i = S. To verify 
this last step, we perform a trace test, which we explain next. 


3.2. Trace test. The trace test was introduced in [29] where they use it in a method for 
computing the solutions of a polynomial system using monodromy and to decompose positive 
dimensional reducible varieties. Here, we state the trace test criterion and we illustrate it with 
an example. 

Suppose we have a reduced irreducible 1-dinrensional affine variety X of C n . The intersection 
of X with a generic hyperplane £ consists of deg(X)-many points. The trace of X with respect 
to the hyperplane £ is the point defined by the coordinate-wise sum of the points in £ OX. If 
£ is defined by a linear form Z(x), let Ct be a linear deformation of £ induced by l(x) + 1. The 
trace of Ct depends on t. The main result of [29] says the following. 

Proposition 3.1. With the assumptions and notation above, the trace of X with respect to 
the family of hyperplanes Ct is affine linear in t. Moreover, the coordinate-wise sum of any 
non-empty proper subset of Ct n X is not affine linear in t. 

The idea behind the trace test is that if X is not linear, a linear deformation Ct will not 
deform the points Ct n X linearly, but their trace will. In monodromy methods, the trace test 
is particularly useful to verify that all points of £ n X are found. We illustrate this test with 
the following example. 

Example 3.2. Consider X defined by x\ — X 2 = 0. Let £ be defined by l(x) = 2xi + 4 x 2 — L 
The trace of X with respect to the linear deformation Ct induced by l(x) + t is (—— \t + |). 
However, the nonempty proper subsets of £* n X have traces equaling the points 

^ -l + £-4t+5 —(2t—3W—4t+5 ^ and w—4t+5 -(2t-3)+ v / ^4I+5 ^ 

The results of the forthcoming work [14] give a trace test for multi-projective varieties. Deho- 
mogenizing the multi-projective variety allows us to give a trace test for the fiber of the projection 
p : Y —> C n . Intersecting the variety Y with a linear space of codimension n—1 defined by gen¬ 
eral linear polynomials in u \,..., u n , yields a 1-dimensional subvariety of Y in C n+m X C n . Let 
£ be a bilinear space defined by the product of two general affine linear polynomials h(x) and 
I 2 (u) in the unknowns xi,...,x n and u\,... ,u n respectively. 
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If we define Ct by l\(x)l 2 (u) + t, the trace is affine linear in t for all x,u coordinates. Also, 
for any proper nonempty subset of Ct n Y, the coordinate-wise sum is non-linear in t for at least 
one of the x, u coordinates. 


Example 3.3. In the following example, we examine the previously discussed trace for the 
equations (2.5) that define the total critical variety of X defined by / = 24 X 4 — 2223 and 
objective function \h u = (x\ — u\) 2 + • • • + (24 — U 4 ) 2 . 


12 : R=CC[laml,xl,x2,x3,x4,ul,u2,u3,u4,t] 

—Defines our model X. 

13 : modelEqs={det matrix-[-[xl ,x2},{x3,x4]-]-]- 

—Defines the total critical variety. 

14 : critEqs=modelEqsI{ (xl-ul)-laml*diff(xl,f), (x2-u2)-laml*diff(x2,f), 

(x3-u3)-laml*diff(x3,f), (x4-u4)-laml*diff(x4,f)} 

—Defines a codimension 3 linear space 

15 : linearSpaceU={ul-l ,u2-3,u3+u4-5}- 

—the intersection of linearSpaceU and the total critical variety is a curve. 

1 6 : aCurve=linearSpaceU|critEqs 

—A linear polynomial in the x-coordinates 

17 : Iinearl=2*xl+3*x2+5*x3+7*x4-l 


—A linear polynomial in the u-coordinates 

1 8 : 1inear2=u3+2*u4-7 

—a bilinear polynomial in x,u coordinates. 

19 : Lt=linearl*linear2+t 


Note that the ED-degree of X is equal to the number of points in the intersection of aCurve 
with the variety of linear 2 . When t = 0, the polynomial Lt factors. The ED-degree is the 
number of points of the intersection of Lt with aCurve that are also in the variety of linear2. 

110 : G={Lt]-1aCurve —G is a zero dimensional system when t is specified. 

111 : solsl=bertiniZeroDimSolve(({t}IG),MPTYPE=>2,USEREGENERATI0N=>1); —t=0 

112 : sols2=bertiniZeroDimSolve(({t+.5}|G),MPTYPE=>2,USEREGENERATI0N=>1); —t=-.5 

113 : sols3=bertiniZeroDimSolve(({t+1}IG),MPTYPE=>2,USEREGENERATI0N=>1); —t=-l 

—There are seven solutions. Although the ED-degree is 2. 

114 : #sols1 

ol4 = 7 

—The trace of a set of solutions is the coordinate wise sum of the solution set. 

115 : tracel=sum(solsl/ coordinates) 

11 6 : trace 2 =sum (sols 2 / coordinates) 

117 : trace3=sum (sols3/ coordinates) 

ol7 = {2.29798, .0471255, 6.97206, 20.0172, -10.9275, 7, 21, 


28.7778, 6.22222, -7} 

The trace moves linearly in the x, u coordinates. Therefore, the trace test is verified, if after 
dropping the laml coordinate, the x, u coordinates of the following difference is zero. 

il 8 : drop((tracel-trace2)-(trace2-trace3),1) 

00 I 8 = {-3.33067e-16+1.9984e-15*ii, -1.77636e-15+6.66134e-16*ii, 


1.77636e-14-3.27516e-15*ii, -3.55271e-15+8.32667e-16*ii, 


6.93889e-17*ii, 3.10862e-15*ii, 1.06581e-14-3.77476e-15*ii, 
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-3.5527le-15+3.55271e-15*ii, -4.996e-16*ii} 

3.3. Finding the first critical point. To conclude with our method, we discuss now the way 
we find the first critical point, so that we can run the algorithms discussed in the previous 
sections. The first method uses gradient descent honrotopies that were introduced in m and 
the idea is the following. Let u E C n be fixed parameter values and suppose that xo E X is 
chosen at random, then most likely the vector V\I/ u (xo) will not be in the linear span of the 
columns of the Jacobian matrix .J acX. Thus, not all of the (m+l)-minors of the extended 


Jacobian matrix (2.1) are zero. We use gradient descents and homotopy continuation to track 


xo to a point x* E X where these determinantal conditions are satisfied. 

We assume that we know at least one point xq G X. This is not hard to achieve. For instance, 
we could have a witness set, which is the intersection of X with a general m-dimensional affine 
linear space C. Otherwise, we could find one point in xq G X using cheaters homotopy as follows. 
Let x G C n be chosen at random and let a := F(x) G C m . Most likely, a^O, thus we use the 
following parameter homotopy 


(3.2) 


h(x, t ) := F{x) — (1 — t)a : 


letting t run from 0 to 1, we track x to a point xq in X. 

In general, the starting point xq will not satisfy the polynomial system G(x, A) from (2.5). 
Although F(x o) = 0, the linear combination 

A 0 VT(x;u) + AiV/i(x) 4 -b A m V/ m (x) = K, 

for some nonzero vector K E C n . We consider the following system 


(3.3) 


G K (x, A) : = 


F(x) 

A 0 Vf(x;u) + AiV/i(x) + 


+ A m V/ m (x) - K 


Note that for any K E M n there exists A E P m such that G(xo, A) = 0. We need an appropriate 
K and a homotopy that let K —> 0. We start by defining K := VT u (xo) and the homotopy 


(3.4) H K (x, A, t) := 


F(x) 

A 0 Vf u (x) + AiV/i(x) + • • • + A m V/ m (x) - (1 - t) ■ I< 


If A = [1 : 0 : ... : 0], the point (xo, A) is a solution to (3.4) when t = 0. We use homotopy 
continuation to find a solution (x*, A*) to (3.4) when t = 1. To guarantee that A E P m , we work 
inside the following affine patch of P m : 


(3.5) H%(x,\,t):= 


Ao VT m (x) + AiV/i(x) + 
Aq + oi Ai + 


F(x) 


+ A m V/ m (x) - (1 - t) ■ K 

+ ®mAm — CL o 


where dj E K \ {0} are chosen at random. For the homotopy we start at t = 0 from the 
point (xo, ao, 0,..., 0) and track it to a solution (x*, Ag,..., AJ^) E C n+m+1 for t = 1, where not 
all of the X* are zero. The computed x* will be a critical point and it becomes the starting point 
of our monodromy method. 


4. Implementation and Illustrating Example 

In this section, we report some of the computations we have done, including those that could 
be achieved with our method for the first time. 
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4.1. Planar ellipse. We start by going step-by-step in our method to compute the ED degree 
of the ellipse from Example |1.1| in the Introduction. The following code defines the equations of 
the ellipse. For numerical stability, we start by normalizing the coefficients of the equation. 

11 : R=QQ [xl,x2,ul,u2,Ll] ; 

12 : fModel=1/3000*(1744*xl~2-2016*xl*x2-2800*xl+1156*x2"2+2100*x2+1125) 

218 2 84 289 2 14 7 3 

o 2 =-xl --xl*x 2 +-x 2 -xl + —x2 + - 

375 125 750 15 10 8 

o2 : R 


4.2. Determining the first point. We find one critical point in the ellipse as we explained 
in Section 3.3, by using the cheaters homotopy to find one point in the ellipse, and the gra¬ 
dient descent to track it to a critical point. First, we compute a random point x r £ C 2 and 
define b = f{x r ). We use the cheaters homotopy (3.2) and use the track function from the 
NumericalAlgebraicGeometry package [22] inside Macaulay2, which requires of a squared 
system. We square the system by considering a random line L that passes through x r . We 
built up this function in our library, namely pSlice, which takes a point p and a number a and 
creates a random linear space of dimension a passing through p. 


110 : xrand = flatten entries random(R~2,R~l) —random point in QQ~2 

1 

olO = 1} 

2 

111 : b =sub(fModel,{xl=>xrand_0, x2=>xrand_l]-) —substitue in fModel 

115 : L = pSlice(xrand,1) 

ol5 = {.0857144x1 + .506099x2 - .548957} 
ol5 : List 

116 : p2 = flatten entries random(R~2,R"l); 

117 : L2 = pSlice(p2,1); 

118 : fstart = {fModel-b} I L; 

119 : fend = {fModel} I L2; 

120 : xlnit = track(fstart,fend, {sols}) — xlnit is a point in the ellipse 

121 : xlnit = coordinates first xlnit 

o21 = {.365254+.807261*ii, .165208+.859724*ii} 
o21 : List 


We use the gradient descent homotopy described in (3.5) to track the random point x r to a 
critical point. This function is implemented in our library under the name gradDescHomot, 
which takes a system, an initial point xq £ X and a parameter point u. 


123 : uPoint={0.75, -0.29}; —start data point 

124 : firstCritPt = gradDescHomot({fModel}, xlnit, uPoint); 
o24 = {.598568, -.0941507, .573787, -.498996} 

o24 : List 

125 : xPoint = take(firstCritPt, 2); 
o25 = {.598568, -.0941507} 

o25 : List 

126 : lambdas = take(firstCritPt,-2) 
o26 = {.573787, -.498996} 

o26 : List 

127 : lagPoint = {lambdas_l/lambdas_0} 
o27 = {-.869654} 
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o27 : List 


Thus, we found the point x* = (0.598568, —0.0941507) as a critical point, and our function also 
gives the values of the Lagrange multipliers, so A* = (0.573787, —0.498996) and these form a 
solution to the system H(x, A, 1) defined in (|3.4|). Therefore, in the affine patch when Aq = 1, 


the value of Ai = —0.869654 and the system H(x, A, 1) becomes the system (1.2) from the 
introduction. 


4.3. Performing monodromy. Now that we have one critical point (x*,A*), we compute 
random monodromy paths to find all the complex critical points. 

128 : ourStartPoint=xPointIlagPoint 

o28 - {.598567767156653, -.0941507204851288, -.869654152906697} 
o28 : List 

We call Bertini, so we need to specify the directory where we want Bertini and Macaulay2 store 
the temporary files. We chose the variable theDir to store the string with the directory. 

141 : M=matrix{{xl-ul,x2-u2}}||matrix{{diff(xl,fModel),diff(x2,fModel)}} 

o41 = I xl-ul x2-u2 I 

I 436/375x1-84/125x2-14/15 -84/125x1+289/375x2+7/10 I 
2 2 

o41 : Matrix R <- R 

142 : critEqs=flatten entries ( matrix{{l,L1}}*M) 

436 84 14 84 289 7 

o42 = {-xl*Ll - -x2*Ll + xl - ul - —LI, - -xl*Ll + -x2*Ll + x2 - u2 + — 

375 125 15 125 375 10 

o42 : List 

143 : Eqs={fModel}|critEqs ; 

129 : makeB’InputFile(theDir, 

B’Configs=>{ 

{"MPTYPE",2}, 

{"PARAMETERH0M0T0PY",2}, 

{"USEREGENERATION",1}}, 

AVG=>{{xl,x2,Ll}}, 

PG=>{ul,u2}, 

B’Polynomials=>Eqs); 

i31 : theSolutionsViaMonodromy^’PHMonodromyCollect (theDir, 

MonodromyStartPoints=>{ourStartPoint},—a list of points 
MonodromyStartParameters=>uPoint, 

NumberOfLoops=>100,NumSolBound=>4); 
i34 : importSolutionsFile(theDir, NameSolutionsFile=>"start") 
o34 = {{.252902, -.814004, -5.38676}, {.598568, -.0941507, -.869654}, 

{.83295, -.00662553, -2.09671}, {.844456, -.333067,-.346872}} 

Our procedure took 2 seconds and 18 random loops to find all the four critical points given in 
Example 0 from the Introduction. 

4.4. Performing the trace test. In this section, we outline code that allows us to do a trace 
test. We recall that our model is defined by the equation fModel and the total critical variety 
is defined by Eqs. 
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11 : R=CC [xl,x2,L1,ul,u2,t] 

12 : fModel=1/3000*(1744*xl~2-2016*xl*x2-2800*xl+1156*x2~2+2100*x2+1125) 

13 : M=matrix{{xl-ul,x2-u2}}1|matrix{{diff(xl,fModel),diff(x2,fModel)}} 

14 : critEqs=flatten entries ( matrix{{l,L1}}*M) 

15 : Eqs={fModel}|critEqs; 

We now consider a linear space defined by sliceU of codimension 1. Intersecting the 2 dimen¬ 
sional total critical variety with sliceU produces a curve for which we will perform the trace 
test on. We intersect this curve with Lt which is a bilinear polynomial in the x, u coordinates. 
This intersection consists of 6 points. When t = 0, 4 of the points vanish on 12 and correspond 
to the 4 points we found in the previous subsection. Now we can use these four points as start 
points for a second monodromy computation. This monodromy computation treats t as the 
parameter and the x's, u's, X's as unknowns. 

16 : sliceU=.3*(ul-.75)-.1*(u2+.29) 

17 : 11 = (.2*xl+.3*x2+.5) 

18 : 12 = (ul-.75)+.7*(u2+.29) 
ill : Lt=ll*12+t 

113 : printingPrecision=200 

114 : startPointXLtM. 598567767156653, -.0941507204851288, -.869654152906697, 

.75,-.29} 

115 : EqsAll=-[sliceU,Lt} I Eqs 

116 : makeB’InputFile(theDir, 

B’Configs=>{ 

{"MPTYPE",2}, 

{"PARAMETERH0M0T0PY",2}, 

{"USEREGENERATION",1}}, 

AVG=>{{xl,x2,LI,ul,u2}}, 

PG=>{t}, 

B’Polynomials=>EqsAll); 

117 : solutionsForTraceTest=b , PHMonodromyCollect(theDir, 

MonodromyStartPoints=>-[startPointXLU},—a list of points 
MonodromyStartParameters=>-[0}, 

NumberOfLoops=>50,NumSolBound=>6) 

We perform a trace test to verify we have found all solutions. By deforming t linearly from 0 
to .lq and to .2q, we now have three solutions sets. Storing the traces of these solution sets 
as threeTraces we do the trace test at i25. The output is numerically zero for the x and u 
coordinates showing that the trace is indeed linear. Therefore we conclude that we have found 
all of the points. And because at t = 0 there are 4 of six solutions on 12 we conclude the 
ED-degree is 4. 

118 : gamma=.0177494619790914+.60014762266504*ii 

119 : threeSolutionSets={ solutionsForTraceTest} 

120 : b’PHSequence(theDir,l*gamma}}) 

121 : threeSolutionSets=append(threeSolutionSets, importSolutionsFile(theDir)); 

122 : b’PHSequence(theDir,{{.2*gamma}}) 

123 : threeSolutionSets=append(threeSolutionSets, importSolutionsFile(theDir)); 

124 : threeTraces=for i in threeSolutionSets list sum i 

125 : (threeTraces_0-threeTraces_l)-(threeTraces_l-threeTraces_2) 
oo25 = -[-4.44089209850063e-15+3.88578058618805e-16*ii, 


-4.44089209850063e-15+2.54787510534094e-16*ii 
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-1.24344978758018e-14-3.33066907387547e-15*ii, 


4.44089209850063e-16+4.57966997657877e-16*ii, 


1.77635683940025e-15-7.21644966006352e-16*ii} 


5. Experimental Timings 


We end with experimental timings showing the monodromy method effectively computes 
critical points. Note that these timings were done in serial but can also be performed in parallel. 


5.1. Matrix Models in Statistics. In [15], a polynomial system to determine the critical 
points of likelihood equations is given for matrices with rank constraints. By solving this poly¬ 
nomial system using regeneration methods m, the ML-degrees for various mxn matrices \p t j\ 
with rank at most r were computed. Their results include detraining the following ML-degrees. 


(m, n ) = 

(3,3) 

(3,4) 

(3,5) 

(4,4) 

(4,5) 

(4,6) 

r = 1 

1 

1 

1 

1 

1 

1 

r = 2 

10 

26 

58 

191 

843 

3119 

r = 3 

1 

1 

1 

191 

843 

3119 

r = 4 




1 

1 

1 

wing experimental timings for 

the rows 

labeled * in 

Table 


In [13], the following experimental timings for the rows labeled * in Table [I] were made. We 
include our experimental timings for computing this number of critical points in bold using 
the monodromy method. We see that the monodromy method performs significantly faster in 
determining these critical points. We also include the number of loops that were needed to 
compute these critical points. 


(m, n, r ) 

(3,3,2) 

(3,4,2) 

(3,5,2) 

(4,4,2) 

(4,4,3) 

Polyhedral using PHC* 

4s 

120 s 

2017s 

23843s 

1869s 

Regeneration using Bertini* 

6 s 

61s 

188s 

2348s 

7207s 

Monodromy using Bertini.m2 

4s 

10s 

79s 

322s 

496s 

and ff of monodromy loops 

8 

11 

11 

13 

18 


Table 1. Running times for preprocessing in serial. Rows marked * are from PI- 
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